Published in: Workshop on Information Theory and its Applications, 
San Diego, CA. February 2006 



Analysis of LDGM and compound codes for lossy compression and binning



Emin Martinian 
Mitsubishi Electric Research Labs 
Cambridge, MA 02139, USA 
Email: martinian@merl.com



Martin J. Wainwright 
Dept. of Statistics and Dept. of EECS, 
UC Berkeley, Berkeley, CA 94720 
Email: wainwrig@{eecs,stat}.berkeley.edu



Abstract — Recent work has suggested that low-density gener- 
ator matrix (LDGM) codes are likely to be effective for lossy 
source coding problems. We derive rigorous upper bounds on 
the effective rate-distortion function of LDGM codes for the 
binary symmetric source, showing that they quickly approach 
the rate-distortion function as the degree increases. We also 
compare and contrast the standard LDGM construction with a 
compound LDPC/LDGM construction introduced in our previous 
work, which provably saturates the rate-distortion bound with 
finite degrees. Moreover, this compound construction can be 
used to generate nested codes that are simultaneously good 
as source and channel codes, and are hence well-suited to 
source/channel coding with side information. The sparse and
high-girth graphical structure of our constructions renders them
well-suited to message-passing encoding.

I. Introduction 

For channel coding problems, codes based on graphical con- 
structions, including turbo codes and low-density parity check 
(LDPC) codes, are widely used and well understood [16]. 
However, many communication problems involve aspects of 
quantization, or quantization in conjunction with channel 
coding. Well-known examples include lossy data compression, 
source coding with side information (the Wyner-Ziv problem), 
and channel coding with side information (the Gelfand-Pinsker 
problem). For such communication problems involving
quantization, the use of sparse graphical codes and message-passing
algorithms is not yet as well understood.

A standard approach to lossy compression is via
trellis-coded quantization (TCQ) [10], and various researchers have
exploited it for single-source and distributed compression [2], 
[18] as well as information embedding problems [1], [8]. 
A limitation of TCQ-based approaches is the fact that sat- 
urating rate-distortion bounds requires increasing the trellis 
constraint length, which incurs exponential complexity (even 
for message-passing algorithms). It is thus of considerable 
interest to explore alternative sparse graphical codes for lossy 
compression and related problems. A number of researchers 
have suggested the use of LDGM codes for quantization 
problems [13], [17], [4], [15]. Focusing on binary erasure 
quantization (a special compression problem), Martinian and 
Yedidia [13] proved that LDGM codes combined with modi- 
fied message-passing can saturate the fundamental bound. A 
number of researchers have explored variants of the
sum-product algorithm [15] or survey propagation algorithms [3],
[17] for quantizing binary sources. Suitably designed degree
distributions yield performance extremely close to the rate- 
distortion bound [17]. Various researchers have used tech- 
niques from statistical physics, including the cavity method 
and replica methods, to provide non-rigorous analyses of 
LDGM performance for source coding [3], [4], [15]. However, 
thus far, it is only in the limit of zero-distortion that this 
analysis has been made rigorous [6], [14], [5], [7]. 

In this paper, we begin in Section II by establishing rigorous
upper bounds on the effective rate-distortion function of
check-regular families of LDGM codes for all distortions
$D \in [0, \tfrac{1}{2}]$ under (maximum-likelihood) decoding. Our analysis is
based on a combination of the second-moment method, a
tool commonly used in analysis of satisfiability problems [6],
[7], with standard large-deviation bounds. Our bounds show
that LDGM codes can come very close to the rate-distortion
lower bound. Although the residual gap vanishes rapidly as the
check degrees are increased, it remains non-zero for any finite
degree. In Section III, we discuss an LDPC/LDGM compound
construction, which we introduced in previous work [11]. Here
we provide a refined analysis of the fact that this compound
construction can saturate the rate-distortion bound with finite
degrees. We conclude in Section IV with a discussion of the
extension of our constructions to source and channel coding
with side information [12], as well as the application of
practical message-passing algorithms [17].
Notation: Vectors/sequences are denoted in bold (e.g., $\mathbf{s}$),
random variables in sans serif font (e.g., $\mathsf{s}$), and random
vectors/sequences in bold sans serif (e.g., $\boldsymbol{\mathsf{s}}$). Similarly, matrices
are denoted using bold capital letters (e.g., $\mathbf{G}$) and random
matrices with bold sans serif capitals (e.g., $\boldsymbol{\mathsf{G}}$). We use $I(\cdot\,;\cdot)$,
$H(\cdot)$, and $D(\cdot\,\|\,\cdot)$ to denote mutual information, entropy,
and relative entropy (Kullback-Leibler distance), respectively.
Finally, we use $\mathrm{card}\{\cdot\}$ to denote the cardinality of a set,
$\|\cdot\|_p$ to denote the $p$-norm of a vector, $\mathrm{Ber}(t)$ to denote a
Bernoulli-$t$ distribution, and $H_b(t)$ to denote the entropy of a
$\mathrm{Ber}(t)$ random variable.

II. Bounds on standard LDGM constructions 

In this section, we begin by defining the check-regular 
LDGM ensemble. We then state and prove rigorous upper 
bounds on the effective rate-distortion function of this ensem- 
ble under ML encoding. 



A. Check-regular ensemble and lossy compression 

A low-density generator matrix (LDGM) code of rate
$R = m/n$ consists of a collection of $n$ checks connected
to a collection of $m$ information bits; see Figure 1 for an
illustration.

Fig. 1. Factor graph representation of an LDGM code with $n$
checks (each associated with a source bit), and $m$ information
bits. The check-regular ensemble is formed by having each
check choose $\gamma_t$ bit neighbors uniformly at random.

The ensemble of LDGM codes that we study in this paper is
constructed as follows: each check connects to $\gamma_t$ information
bits, chosen uniformly at random from the set of $m$ information
bits. We use $G \in \{0,1\}^{m \times n}$ to denote the resulting generator
matrix; by construction, each column of $G$ has exactly $\gamma_t$ ones,
whereas each row (corresponding to a variable node) has an
(approximately) Poisson number of ones. This construction, while
not particularly good from the coding perspective^1, has been
studied in both the satisfiability and statistical physics
literatures [6], [14], [5], [7], where it is referred to as the
"$k$-XORSAT" or "$p$-spin" model. An advantage of this
regular-Poisson degree ensemble is that the resulting distribution
of a random codeword is extremely easy to characterize:

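Lemma 1. Let $G \in \{0,1\}^{m \times n}$ be a random generator
matrix obtained by randomly placing $\gamma_t$ ones per column.
Then for any vector $w \in \{0,1\}^m$ with a fraction $v$ of
ones, the distribution of the corresponding codeword $wG$ is
Bernoulli($\delta(v; \gamma_t)$), where

$$\delta(v; \gamma_t) = \tfrac{1}{2}\left[1 - (1 - 2v)^{\gamma_t}\right]. \tag{1}$$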
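As a quick sanity check on Lemma 1, the sketch below compares the closed form $\delta(v;\gamma_t) = \tfrac{1}{2}[1 - (1-2v)^{\gamma_t}]$ against a brute-force enumeration of the parity of $\gamma_t$ bits. It assumes that the $\gamma_t$ information bits seen by a check behave as i.i.d. $\mathrm{Ber}(v)$ variables, which is an idealization of the ensemble; the function names `delta` and `xor_probability` are illustrative only.

```python
from itertools import product


def delta(v: float, gamma_t: int) -> float:
    """Closed form assumed for equation (1): P[XOR of gamma_t i.i.d. Ber(v) bits = 1]."""
    return 0.5 * (1.0 - (1.0 - 2.0 * v) ** gamma_t)


def xor_probability(v: float, gamma_t: int) -> float:
    """Same probability, obtained by enumerating all 2**gamma_t bit patterns."""
    total = 0.0
    for bits in product((0, 1), repeat=gamma_t):
        ones = sum(bits)
        if ones % 2 == 1:  # the parity check evaluates to 1
            total += (v ** ones) * ((1.0 - v) ** (gamma_t - ones))
    return total


if __name__ == "__main__":
    for gamma_t in (3, 4, 6):
        print(gamma_t, delta(0.1, gamma_t), xor_probability(0.1, gamma_t))
```

The two columns printed for each degree should agree up to floating-point rounding.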



An LDGM code with generator matrix $G$ can be used to
perform lossy data compression as follows. Given a source
sequence $y \in \{0,1\}^n$ drawn i.i.d. from a $\mathrm{Ber}(\tfrac{1}{2})$ source,
we use it to set the parities of the $n$ checks at the
top of Figure 1. We then seek an optimal encoding of
the source sequence by solving the optimization problem
$d(\hat{y}, y) := \min_{z \in \{0,1\}^m} d(z'G, y)$, where $d(\cdot,\cdot)$ denotes the
Hamming distortion. For a code of given rate $R$, we are
interested in the expected minimum distortion $\mathbb{E}[d(\hat{Y}, Y)]$
that can be achieved, where the expectation is taken over
the Bernoulli source. For all distortions $D \in [0, \tfrac{1}{2}]$, the
rate-distortion function is well-known to take the form $R(D) = 1 - H_b(D)$.
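The encoding problem above can be made concrete with a toy sketch: sample a check-regular code, then search all $2^m$ information sequences for the one whose codeword is closest to the source. This is only feasible for very small $m$ and is not the message-passing approach discussed later; the helper names, the use of distinct neighbors per check, and the normalization of the Hamming distortion by $n$ are assumptions made for illustration.

```python
import random
from itertools import product


def sample_check_neighbors(n, m, gamma_t, rng):
    """For each of the n checks, pick gamma_t distinct information bits uniformly at random."""
    return [rng.sample(range(m), gamma_t) for _ in range(n)]


def encode(z, neighbors):
    """Codeword bit j is the mod-2 sum of the information bits attached to check j."""
    return [sum(z[i] for i in nbrs) % 2 for nbrs in neighbors]


def brute_force_compress(y, neighbors, m):
    """Exhaustive search over all 2**m information sequences (toy sizes only)."""
    best_z, best_dist = None, len(y) + 1
    for z in product((0, 1), repeat=m):
        dist = sum(a != b for a, b in zip(encode(z, neighbors), y))
        if dist < best_dist:
            best_z, best_dist = z, dist
    return best_z, best_dist / len(y)  # normalized Hamming distortion


if __name__ == "__main__":
    rng = random.Random(0)
    n, m, gamma_t = 16, 8, 3  # rate R = m/n = 0.5
    neighbors = sample_check_neighbors(n, m, gamma_t, rng)
    y = [rng.randint(0, 1) for _ in range(n)]  # Ber(1/2) source sequence
    z_hat, distortion = brute_force_compress(y, neighbors, m)
    print("information bits:", z_hat, "normalized distortion:", distortion)
```

The stored description is `z_hat` ($m$ bits), and the reconstruction is `encode(z_hat, neighbors)`.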

B. Theoretical results 

We begin by stating our main results on the rate-distortion
performance of LDGM codes. For $\delta \in (0,1)$ and $D, u \in [0, \tfrac{1}{2}]$,
define $\lambda^*(\delta, D, u) = \min\{0, \log \rho^*(\delta, D, u)\}$, where $\rho^*$ is the
unique positive root^2 of the quadratic equation $Ax^2 + Bx + C = 0$
with coefficients

$$A = \delta(1-\delta)(1-D), \tag{2a}$$
$$B = u(1-\delta)^2 + (1-u)\delta^2 - D\left[\delta^2 + (1-\delta)^2\right], \tag{2b}$$
$$C = -D\,\delta(1-\delta). \tag{2c}$$

For $\delta = 0$, we set $\lambda^*(0, D, u) = 0$. Next define the function
$F[D; \delta]$ in a variational manner as follows:

$$F[D;\delta] := \max_{u \in [0,D]} \Big\{ H_b(u) - H_b(D) + u \log\big[(1-\delta)e^{\lambda^*(\delta,D,u)} + \delta\big] + (1-u)\log\big[\delta e^{\lambda^*(\delta,D,u)} + (1-\delta)\big] - D\,\lambda^*(\delta,D,u) \Big\}. \tag{3}$$

^1 In particular, for bounded check degree $\gamma_t$, the Poisson degree distribution
means that there are typically a constant fraction of isolated (degree zero)
information bits.

Fig. 2. Plot of the function $U(v; D, \gamma_t)$ for $D = 0.11$ and
$\gamma_t = 4$. For $v = 0$, we have $U(0; D, \gamma_t) = 1 - H_b(D)$, so
that the upper bound (4) is always above the Shannon bound.
The value $\max_{v \in [0,1]} U(v; D, \gamma_t)$ determines the excess rate
required beyond the Shannon bound to achieve distortion $D$.
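The definitions (2a)-(2c) and (3) translate directly into a small numerical routine. The sketch below is one possible reading, not the authors' implementation: it assumes all logarithms are base 2 (so that the terms combine with $H_b$ in bits), treats $e^{\lambda^*}$ as the clipped root $\min\{1, \rho^*\}$, and replaces the maximization over $u \in [0, D]$ by a uniform grid. The names `hb`, `rho_star` and `F` are illustrative.

```python
import math


def hb(p):
    """Binary entropy in bits, with the convention H_b(0) = H_b(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def rho_star(delta, D, u):
    """Positive root of A x^2 + B x + C = 0 with the coefficients (2a)-(2c)."""
    A = delta * (1.0 - delta) * (1.0 - D)
    B = (u * (1.0 - delta) ** 2 + (1.0 - u) * delta ** 2
         - D * (delta ** 2 + (1.0 - delta) ** 2))
    C = -D * delta * (1.0 - delta)
    return (-B + math.sqrt(B * B - 4.0 * A * C)) / (2.0 * A)


def F(D, delta, grid=2000):
    """Grid approximation of F[D; delta]; needs 0 < D <= 1/2 and 0 <= delta < 1."""
    best = -math.inf
    for k in range(grid + 1):
        u = D * k / grid
        # rho stands for exp(lambda*); lambda* = min{0, log rho*} means rho <= 1,
        # and lambda* = 0 (rho = 1) is used when delta = 0.
        rho = 1.0 if delta == 0.0 else min(1.0, rho_star(delta, D, u))
        value = (hb(u) - hb(D)
                 + u * math.log2((1.0 - delta) * rho + delta)
                 + (1.0 - u) * math.log2(delta * rho + (1.0 - delta))
                 - D * math.log2(rho))
        best = max(best, value)
    return best


if __name__ == "__main__":
    print(F(0.11, 0.0), F(0.11, 0.2), F(0.11, 0.5))
```

Two limiting cases are easy to check by hand under these assumptions: for $\delta = 0$ the maximum is attained at $u = D$ and gives $F = 0$, and for $\delta = \tfrac{1}{2}$ the root is $\rho^* = D/(1-D)$, which gives $F = H_b(D) - 1$.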



With these definitions, we have: 



Theorem 1. The rate-distortion function of the $\gamma_t$-regular
ensemble is upper bounded by

$$R_{up}(D;\gamma_t) := \max_{v \in [0,1]} \frac{1 - H_b(D) + F[D;\delta(v;\gamma_t)]}{1 - H_b(v)}. \tag{4}$$



To provide some intuition for the behavior of the function
$U(v; D, \gamma_t) := \frac{1 - H_b(D) + F[D;\delta(v;\gamma_t)]}{1 - H_b(v)}$ that determines the
bound (4), Figure 2 provides a plot^3 for the case $D = 0.11$
and $\gamma_t = 4$. For $v = 0$, it can be seen that $F[D; \delta(0;\gamma_t)] = 0$,
so that $U(0; D, \gamma_t) = 1 - H_b(D)$, implying that the upper
bound is never smaller than the Shannon lower bound.
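A rough numerical sketch of $U(v; D, \gamma_t)$ and of the resulting bound can help when reproducing a curve like the one described here. It rests on several assumptions that are not stated in the text: base-2 logarithms throughout, $\delta(v;\gamma_t) = \tfrac{1}{2}[1 - (1-2v)^{\gamma_t}]$, $e^{\lambda^*}$ read as $\min\{1, \rho^*\}$, coarse grids for both maximizations, and a scan restricted to $v \in [0, 0.49]$ (for even $\gamma_t$ the function is symmetric about $\tfrac{1}{2}$). All function names are hypothetical.

```python
import math


def hb(p):
    """Binary entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def delta(v, gamma_t):
    """Assumed form of the codeword-bit bias from Lemma 1."""
    return 0.5 * (1.0 - (1.0 - 2.0 * v) ** gamma_t)


def F(D, d, grid=200):
    """Grid approximation of F[D; d] for 0 < D <= 1/2 and 0 <= d < 1."""
    best = -math.inf
    for k in range(grid + 1):
        u = D * k / grid
        if d == 0.0:
            rho = 1.0  # lambda* = 0 by convention when delta = 0
        else:
            A = d * (1.0 - d) * (1.0 - D)
            B = u * (1.0 - d) ** 2 + (1.0 - u) * d ** 2 - D * (d ** 2 + (1.0 - d) ** 2)
            C = -D * d * (1.0 - d)
            rho = min(1.0, (-B + math.sqrt(B * B - 4.0 * A * C)) / (2.0 * A))
        value = (hb(u) - hb(D) + u * math.log2((1.0 - d) * rho + d)
                 + (1.0 - u) * math.log2(d * rho + 1.0 - d) - D * math.log2(rho))
        best = max(best, value)
    return best


def U(v, D, gamma_t):
    """Ratio whose maximum over v gives the upper bound on the rate."""
    return (1.0 - hb(D) + F(D, delta(v, gamma_t))) / (1.0 - hb(v))


def r_up(D, gamma_t, grid=200, v_max=0.49):
    """Grid maximum of U over v in [0, v_max]."""
    return max(U(v_max * k / grid, D, gamma_t) for k in range(grid + 1))


if __name__ == "__main__":
    D, gamma_t = 0.11, 4
    print("Shannon bound:", 1.0 - hb(D), "grid estimate of the bound:", r_up(D, gamma_t))
```

Because the grid includes $v = 0$, the estimate can never fall below $1 - H_b(D)$; a finer grid near small $v$ may be needed to resolve any overshoot accurately.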

^2 An explicit expression is $\rho^* = \frac{1}{2A}\left[-B + \sqrt{B^2 - 4AC}\right]$.

^3 Note that for even $\gamma_t$, the function $U$ is symmetric about $\tfrac{1}{2}$, so we only
plot one half of the function.

Fig. 3. The Shannon rate-distortion function $R(D) = 1 - H_b(D)$
provides a lower bound on any construction. Plots
of the upper bound (4) for LDGM ensembles with $\gamma_t \in \{3, 4, 6\}$.

By determining the maximum (4) for a range of rates and
degrees $\gamma_t$, we can trace out parametric upper bounds on
the rate-distortion function. Figure 3 provides plots of the
bound (4) on the rate-distortion function for $\gamma_t \in \{3, 4, 6\}$.
Also shown is the Shannon curve $R(D) = 1 - H_b(D)$, which
is a lower bound for any construction. Finally, an important
special case of Theorem 1 is the limit of zero distortion ($D = 0$),
in which case the rate-distortion function corresponds to the
satisfiability threshold. In this case, we recover as a corollary
the following result previously established by Creignou et
al. [6]:

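Corollary 1. The random $\gamma_t$-XORSAT satisfiability threshold
is lower bounded by $\alpha^*(\gamma_t) := 1 / R_{up}(0;\gamma_t)$, where

$$R_{up}(0;\gamma_t) = \max_{v \in [0,1]} \frac{1 + \log\left[1 - \delta(v;\gamma_t)\right]}{1 - H_b(v)}. \tag{5}$$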



This special case reveals that our upper bounds are not
sharp, as the bounds (5) are known to be loose for the $D = 0$
case. Indeed, several researchers [14], [5], [7] have derived the
exact threshold values for the XORSAT problem. However, the
looseness in the bound (5) rapidly vanishes as $\gamma_t$ increases.
As an illustration, for $\gamma_t = 3$, we have $\alpha^*(3) = 0.88949$ in
contrast to the exact threshold $c^*(3) = 0.91794$, whereas for
$\gamma_t = 6$, we have $\alpha^*(6) = 0.99623$ in contrast to the exact
threshold $c^*(6) = 0.99738$.
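The threshold values quoted above can be approximated with a simple grid search. The sketch assumes the reading $R_{up}(0;\gamma_t) = \max_v \frac{1 + \log_2[1 - \delta(v;\gamma_t)]}{1 - H_b(v)}$ with $\delta(v;\gamma_t) = \tfrac{1}{2}[1 - (1-2v)^{\gamma_t}]$ and $\alpha^* = 1/R_{up}$. It scans only $v \in [0, \tfrac{1}{2})$: for even $\gamma_t$ the ratio is symmetric about $\tfrac{1}{2}$, and for odd $\gamma_t$ the numerator is negative when $v > \tfrac{1}{2}$, so nothing is lost. The name `alpha_star` is illustrative.

```python
import math


def hb(p):
    """Binary entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def alpha_star(gamma_t, grid=100000):
    """Grid estimate of 1 / R_up(0; gamma_t), maximizing over v in [0, 1/2)."""
    best = 1.0  # value of the ratio at v = 0, where delta = 0 and H_b(0) = 0
    for k in range(1, grid):
        v = 0.5 * k / grid
        d = 0.5 * (1.0 - (1.0 - 2.0 * v) ** gamma_t)
        ratio = (1.0 + math.log2(1.0 - d)) / (1.0 - hb(v))
        best = max(best, ratio)
    return 1.0 / best


if __name__ == "__main__":
    for gamma_t in (3, 6):
        print(gamma_t, alpha_star(gamma_t))
```

Under these assumptions the printed estimates should land close to the values of $\alpha^*(3)$ and $\alpha^*(6)$ given in the text.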

C. Proof of Theorem 1

The remainder of this section is devoted to proving the 
previous result. Our proof exploits Shepp's second moment 
method, which is a standard tool in satisfiability analysis: 

Lemma 2. For any nonnegative integer-valued random variable
$z$, we have $\mathbb{P}[z > 0] \ge \frac{(\mathbb{E}[z])^2}{\mathbb{E}[z^2]}$.
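The second moment inequality $\mathbb{P}[z > 0] \ge (\mathbb{E}[z])^2 / \mathbb{E}[z^2]$ is easy to check exactly on a small example. The sketch below uses a binomial variable purely for illustration (it is not the codeword count analyzed in this section) and exact rational arithmetic to avoid rounding issues.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """Exact pmf of a Binomial(n, p) variable, with p given as a Fraction."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]


def second_moment_bound(pmf):
    """Return (P[z > 0], E[z]^2 / E[z^2]) for a pmf on {0, 1, ..., len(pmf) - 1}."""
    first = sum(k * q for k, q in enumerate(pmf))
    second = sum(k * k * q for k, q in enumerate(pmf))
    return 1 - pmf[0], first ** 2 / second


if __name__ == "__main__":
    prob, bound = second_moment_bound(binomial_pmf(10, Fraction(1, 20)))
    print(float(prob), float(bound))  # the first number should be the larger one
```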

Given an LDGM code $C$ of rate $R = m/n > 0$, let $N = 2^{nR}$
be the total number of codewords. For a given sequence
$s \in \{0,1\}^n$, define for each codeword $i = 1, \ldots, N$ an indicator
variable $x_i(C, s, D)$ for the event that codeword $i$ is within
Hamming distance $Dn$ of the source sequence $s$. Thus, the quantity

$$z(C, s, D) := \sum_{i=1}^{N} x_i(C, s, D) \tag{6}$$

is the total number of codewords that are distortion $D$-optimal.
In order to apply the second moment bound (Lemma 2)
to this random variable, we need to compute the first and
second moments. Here we will be taking expectations over
both the source sequence $s \sim \mathrm{Ber}(\tfrac{1}{2})$ and the choice of
random code $C$ from the $\gamma_t$-regular ensemble. In the following
analysis, we will provide conditions such that

$$\log \big(\mathbb{E}[z(C, s, D)]\big)^2 - \log \mathbb{E}\big[z(C, s, D)^2\big] \ge -\log q(n),$$

where $q(n)$ is a polynomial function of $n$. It can be shown [11]
using martingale arguments that such a statement is sufficient
to establish that the expected distortion is less than $D$.
Consequently, we analyze normalized log probabilities (i.e.,
$\frac{1}{n} \log \mathbb{E}[z(C, s, D)]$), and write $o(1)$ to capture terms of the
form $\frac{\log q(n)}{n}$. The first moment is straightforward to bound
using standard results:

Lemma 3. The first moment is sandwiched as

$$\mathbb{E}[z(C, s, D)] \ge \frac{1}{n+1}\, 2^{n[R - (1 - H_b(D))]}, \tag{7a}$$
$$\mathbb{E}[z(C, s, D)] \le (n+1)\, 2^{n[R - (1 - H_b(D))]}. \tag{7b}$$


We also make use of the following alternative expression for 
the second moment (see [11] for a proof): 

Lemma 4. The second moment $\mathbb{E}[z(D)^2]$ can be decomposed as

$$\mathbb{E}[z(D)^2] = \mathbb{E}[z(D)] + \mathbb{E}[z(D)] \sum_{j \neq 0} \mathbb{P}[x_j(D) = 1 \mid x_0(D) = 1]. \tag{8}$$



Particularly important in our analysis is the following lemma,
which provides a large deviations upper bound on the
conditional probability in equation (8):

Lemma 5. Conditioned on the event that codeword $j$ arises from an
information sequence with a fraction $v$ of ones, we have

$$\frac{1}{n} \log \mathbb{P}[x_j(D) = 1 \mid x_0(D) = 1] \le F[D; \delta(v;\gamma_t)] + o(1),$$

where the function $F$ is defined in equation (3).

Proof: We can reformulate the probability on the LHS as
follows. Let $T$ be a discrete variable with distribution

$$\mathbb{P}[T = t] = \frac{\binom{n}{t}}{\sum_{s=0}^{Dn} \binom{n}{s}} \quad \text{for } t = 0, 1, \ldots, Dn,$$

representing the (random) number of 1s in the source sequence
$s$. Let $V_i$ and $W_j$ denote Bernoulli random variables with
parameters $1 - \delta(v;\gamma_t)$ and $\delta(v;\gamma_t)$ respectively. With this
set-up, conditioned on codeword $j$ arising from an information
sequence with a fraction $v$ of ones, the probability
$\mathbb{P}[x_j(D) = 1 \mid x_0(D) = 1]$ is equivalent to the
probability that the random variable

$$U := \sum_{i=1}^{T} V_i + \sum_{j=1}^{n-T} W_j \tag{9}$$

is at most $Dn$. To bound this probability, we use Chernoff's
bound in the form

$$\frac{1}{n} \log \mathbb{P}[U \le Dn] \le \inf_{\lambda < 0} \left( \frac{1}{n} \log M_U(\lambda) - \lambda D \right). \tag{10}$$

We begin by computing the moment generating function $M_U$.
Taking conditional expectations and using independence, we
have

$$M_U(\lambda) = \sum_{t=0}^{Dn} \mathbb{P}[T = t]\, [M_V(\lambda)]^{t}\, [M_W(\lambda)]^{n-t}.$$

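Of interest to us is the exponential behavior of this expression
in $n$. Using the standard entropy approximation to the binomial
coefficient, we can write $\frac{1}{n} \log M_U(\lambda)$ as

$$\frac{1}{n} \log \Big( \sum_{t=0}^{Dn} \exp\Big[ n \Big\{ H_b\big(\tfrac{t}{n}\big) - H_b(D) + \tfrac{t}{n} \log M_V(\lambda) + \big(1 - \tfrac{t}{n}\big) \log M_W(\lambda) \Big\} \Big] \Big) + o(1),$$

where the cumulant generating functions have the form

$$\log M_V(\lambda) = \log\left[(1-\delta)e^{\lambda} + \delta\right], \tag{11a}$$
$$\log M_W(\lambda) = \log\left[(1-\delta) + \delta e^{\lambda}\right]. \tag{11b}$$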



Note that the exponential behavior of the Chernoff bound (10)
is determined by $\max_{u \in [0,D]} G(u; \lambda)$, where

$$G(u; \lambda) := H_b(u) - H_b(D) + u \log M_V(\lambda) + (1-u) \log M_W(\lambda) - \lambda D.$$

Since cumulant generating functions are strictly convex, we
are guaranteed that $G$ is strictly convex in $\lambda$; similarly, it can
be seen that $G$ is strictly concave in $u$. Moreover, for any
$D > 0$ and $\delta \in (0,1)$, we have $G(u; \lambda) \to +\infty$ as $\lambda \to -\infty$.
Thus, by standard min-max results [9], we can interchange the
order of minimization (over $\lambda < 0$) and maximization (over
$u \in [0, D]$). Taking derivatives with respect to $\lambda$ to find the
minimum, we find that $\frac{\partial G}{\partial \lambda} = 0$ is equivalent to

$$u\, \frac{(1-\delta)\exp(\lambda)}{(1-\delta)\exp(\lambda) + \delta} + (1-u)\, \frac{\delta \exp(\lambda)}{(1-\delta) + \delta \exp(\lambda)} - D = 0.$$



This is a quadratic equation in $\exp(\lambda)$ with coefficients
specified in equation (2); the unique positive root is $\rho^*$ as
defined. Finally, from the Chernoff bound (10), we have

$$\frac{1}{n} \log \mathbb{P}[U \le nD] \le \sup_{u \in [0,D]} G(u; \lambda^*(\delta, D, u)) + o(1).$$

Recognizing that $F[D; \delta] = \sup_{u \in [0,D]} G(u; \lambda^*(\delta, D, u))$
completes the proof of the lemma. □
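The Chernoff step in this proof can be checked against an exact computation for a fixed number of source ones. The sketch below conditions on $T = t$ (so the $H_b(u) - H_b(D)$ term coming from the distribution of $T$ is left out), computes $\mathbb{P}[U \le k]$ exactly for $U = \mathrm{Bin}(t, 1-\delta) + \mathrm{Bin}(n-t, \delta)$, and compares it with the exponent $u \log M_V + (1-u) \log M_W - \lambda D$ evaluated at $\lambda^* = \min\{0, \log \rho^*\}$, using natural logarithms, $u = t/n$ and $D = k/n$. The function names and the parameter values are illustrative choices.

```python
import math
from math import comb


def exact_tail(n, t, delta, k):
    """P[U <= k] for U = Bin(t, 1 - delta) + Bin(n - t, delta), by direct convolution."""
    pv = [comb(t, a) * (1 - delta) ** a * delta ** (t - a) for a in range(t + 1)]
    pw = [comb(n - t, b) * delta ** b * (1 - delta) ** (n - t - b)
          for b in range(n - t + 1)]
    return sum(pv[a] * pw[b]
               for a in range(t + 1) for b in range(n - t + 1) if a + b <= k)


def chernoff_exponent(u, D, delta):
    """u log M_V + (1 - u) log M_W - lambda D at lambda* = min{0, log rho*}."""
    A = delta * (1 - delta) * (1 - D)
    B = u * (1 - delta) ** 2 + (1 - u) * delta ** 2 - D * (delta ** 2 + (1 - delta) ** 2)
    C = -D * delta * (1 - delta)
    rho = min(1.0, (-B + math.sqrt(B * B - 4 * A * C)) / (2 * A))  # rho = exp(lambda*)
    return (u * math.log((1 - delta) * rho + delta)
            + (1 - u) * math.log(delta * rho + (1 - delta))
            - D * math.log(rho))


if __name__ == "__main__":
    n, t, delta, k = 200, 10, 0.3, 22
    exact = math.log(exact_tail(n, t, delta, k)) / n
    bound = chernoff_exponent(t / n, k / n, delta)
    print("exact exponent:", exact, "Chernoff exponent:", bound)
```

Since the Chernoff inequality holds for every $\lambda \le 0$, the exact normalized log-probability can never exceed the printed exponent; the gap between the two shrinks as $n$ grows.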
We are now ready to complete the proof of the theorem.
First of all, by combining Lemmas 3 and 5, we can upper
bound $\frac{1}{n} \log \big( \mathbb{E}[z(D)] \sum_{j \neq 0} \mathbb{P}[x_j(D) = 1 \mid x_0(D) = 1] \big)$ by

$$R - (1 - H_b(D)) + \max_{v \in [0,1]} \big\{ R\, H_b(v) + F[D; \delta(v;\gamma_t)] \big\} + o(1).$$

Combining with Lemma 4, we obtain that $\frac{1}{n} \log \mathbb{E}[z(D)^2]$ is
upper bounded by

$$R - (1 - H_b(D)) + \max_{v \in [0,1]} \big\{ R\, H_b(v) + F[D; \delta(v;\gamma_t)] \big\} + o(1).$$

Now plugging this bound into the second moment
bound (Lemma 2) and using Lemma 3, we obtain that
$\frac{1}{n} \log \mathbb{P}[z(D) > 0]$ is lower bounded by

$$R - (1 - H_b(D)) - \max_{v \in [0,1]} \big\{ R\, H_b(v) + F[D; \delta(v;\gamma_t)] \big\} + o(1).$$

The probability of finding a $D$-optimal word will not vanish
exponentially fast as long as this quantity stays non-negative;
with some simple algebra, this condition is equivalent to the
bound

$$R \ge \max_{v \in [0,1]} \frac{1 - H_b(D) + F[D; \delta(v;\gamma_t)]}{1 - H_b(v)}. \tag{12}$$

Therefore, the true rate-distortion function can be no larger
than the RHS of this equation, thereby completing the proof
of the theorem. □

D. Proof of Corollary 1

To prove the corollary with $D = 0$, we note that equation (9)
now entails evaluating the probability that $\sum_{j=1}^{n} W_j = 0$,
where the $W_j$ are i.i.d. $\mathrm{Ber}(\delta(v;\gamma_t))$ variables. By Sanov's
theorem, the error exponent (i.e., $F$) in this case is simply
$-D(0 \,\|\, \delta(v;\gamma_t)) = \log(1 - \delta(v;\gamma_t))$. Substituting this into
equation (4) and using the fact that $H_b(0) = 0$ yields the
result.




Fig. 4. Illustration of compound LDGM and LDPC code
construction. The top section consists of an $(n, m)$ LDGM
code with generator matrix $G$ and constant check degrees
$\gamma_t = 4$; its rate is $R(G) = m/n$. The bottom section consists
of an $(m, k)$ LDPC code with degrees $(\gamma_v, \gamma_c) = (2, 4)$,
described by parity check matrix $H$ and with rate $R(H) =$



III. Compound Constructions 

In this section, we describe a compound construction,
discussed in our previous work [11], in which an LDGM code
is concatenated with an LDPC code. By contrast with the
standard LDGM construction, finite degrees suffice to saturate
the rate-distortion bound. The compound code construction
is illustrated in Fig. 4; it is defined by a factor graph with
three layers, and consists of an LDGM code with generator
matrix $G$ and an LDPC code with parity check matrix $H$.

Fig. 5. Plot of the function $V(v; D, \gamma_t)$ for $\gamma_t = 4$, a regular
LDPC with degrees $(\gamma_v, \gamma_c) = (4, 8)$, rates $R(G) = 1$
and $R(H) = 0.5$, and distortion $D = 0.11$. This function
remains below $R = 0.5$ for all $v$, so that the code saturates
the Shannon lower bound.

Note that a sequence $y \in \{0,1\}^n$ is a codeword of this
joint LDPC/LDGM construction if and only if there exists
an information sequence $z \in \{0,1\}^m$ such that (a) $z'G = y'$,
and (b) $Hz = 0$ (where all operations are in modulo two
arithmetic).
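The membership condition can be illustrated on a toy example by enumerating every information sequence, keeping those that satisfy the LDPC checks, and mapping them through the generator matrix. The matrices below are hypothetical and chosen only so that each column of `G` has two ones; they are not taken from the paper, and exhaustive enumeration is only sensible for very small $m$.

```python
from itertools import product

# Hypothetical toy matrices: m = 4 information bits, n = 5 checks (source bits).
G = [[1, 0, 0, 1, 1],   # row i lists the checks that information bit i participates in
     [1, 1, 0, 0, 0],
     [0, 1, 1, 0, 1],
     [0, 0, 1, 1, 0]]
H = [[1, 1, 1, 0],      # each row is one parity check on the information bits
     [0, 1, 1, 1]]


def satisfies_precode(z, H):
    """Condition (b): H z = 0 in modulo-two arithmetic."""
    return all(sum(h * b for h, b in zip(row, z)) % 2 == 0 for row in H)


def ldgm_encode(z, G):
    """Condition (a): the codeword is z'G in modulo-two arithmetic."""
    n = len(G[0])
    return tuple(sum(z[i] * G[i][j] for i in range(len(z))) % 2 for j in range(n))


def compound_codewords(G, H):
    """All sequences y = z'G generated by information sequences with H z = 0."""
    m = len(G)
    return {ldgm_encode(z, G)
            for z in product((0, 1), repeat=m) if satisfies_precode(z, H)}


if __name__ == "__main__":
    for y in sorted(compound_codewords(G, H)):
        print(y)
```

Without the precode all $2^4$ information sequences would be admissible; the two parity checks cut this down to four, which is the pruning effect that the LDPC layer is meant to provide.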

The major deficit of LDGM codes, from the point of
view of both source and channel coding, is that they contain
large numbers of poorly separated codewords. Herein lies the
motivation for adding the bottom LDPC precode: it serves to
push apart the valid information bit sequences $z \in \{0,1\}^m$,
thereby spreading apart the associated sequences $z'G$ that
are codewords in the joint LDGM/LDPC construction. To
formalize this intuition, a proof similar to that of Theorem 1
establishes the following:

Theorem 2. The rate-distortion function of the $\gamma_t$-regular
LDGM/LDPC compound construction (with asymptotic
LDPC weight enumerator $A(v)$) is upper bounded by
$R_{com}(D;\gamma_t) := \max_{v \in [0,1]} V(v; D, \gamma_t)$, where

$$V(v; D, \gamma_t) := \frac{1 - H_b(D) + F[D; \delta(v;\gamma_t)]}{1 - A(v)/R(H)}. \tag{13}$$



Note that this statement includes Theorem 1 as a special case,
in which $R(H) = 1$ and $A(v) = H_b(v)$. Of interest to us
here is that these compound constructions (with $R(H) < 1$)
can saturate the rate-distortion bound with finite degrees. The
key is that with suitable choice of LDPC degrees, we can
ensure that $A(v)$ is negative in a region around zero, which
prevents the overshooting phenomenon illustrated in Figure 2.
More specifically, Figure 5 illustrates the analogous plot for a
joint LDGM/LDPC construction with $\gamma_t = 4$, LDPC degrees
$(\gamma_v, \gamma_c) = (4, 8)$, rates $R(G) = 1$ and $R(H) = 0.5$, and
distortion $D = 0.11$. Notice how this curve remains below
$R = 0.5$ for all $v \in [0, 0.5]$, demonstrating that the upper
bound (13) meets the Shannon lower bound.



IV. Discussion 

In concurrent work [12], we have shown that the joint
LDGM/LDPC construction in Figure 4 generates good nested
constructions (i.e., a good channel code can be partitioned 
into good source codes, and vice versa), which can be shown 
to saturate the Wyner-Ziv and Gelfand-Pinsker bounds. We 
have also shown [17] that message-passing algorithms based 
on survey propagation [3], when applied to LDGM codes 
with suitable degree distributions, yield rate-distortion trade- 
offs very close to the Shannon bound. It remains to explore 
variants of such message-passing algorithms for the compound 
construction, and problems of coding with side information. 